Abstract This paper presents a canonical duality theory for solving the nonconvex
minimization problem of the Rosenbrock function. Extensive numerical results show that this
benchmark test problem can be solved precisely and efficiently to obtain global optimal
solutions.
The nonconvex minimization problem of the Rosenbrock function, introduced in [15], is a
benchmark test problem in global optimization that has been used extensively to test
the performance of optimization algorithms and numerical approaches. The global
minimizer of this function is located in a long, deep, narrow, parabolic (banana-shaped)
flat valley (Figure 1).

Figure 1: 2-dimensional Rosenbrock function (www2.imm.dtu.dk/courses/02610/)

Although finding this valley is trivial in most cases, accurately locating the
global optimal solution is very difficult for almost all gradient-type methods and some
derivative-free methods. Due to the nonconvexity, it is easily verified that if the
initial point is chosen to be (3, 3, ..., 3), direct algorithms are always trapped at a
local minimizer for problems with dimensions $n = 5 \sim 7$ as well as $n \ge 4000$; if the
initial point is chosen at (100, 100, ..., 100), iterations stop at a local minimum
with objective function value $\ge 47.23824896$ even for a two-dimensional problem.
This paper will show that by the canonical duality theory, this well-known benchmark
problem can be solved efficiently in an elegant way.

The canonical duality theory was originally developed in nonconvex/nonsmooth
mechanics [9]. It is now realized that this potentially powerful theory can be used for
solving a large class of nonconvex/nonsmooth/discrete problems [10, 12]. In this short
research note, we first show that the nonconvex minimization problem of the Rosenbrock
function can be reformulated as a canonical dual problem (with zero duality gap) and that
the critical points of the Rosenbrock function can be analytically expressed in terms of
its canonical dual solutions. Both global and local extremal solutions can be identified
by the triality theorem. Extensive numerical examples and discussion are presented in
the last section.

2 Primal Problem and Its Canonical Dual 

The primal problem is 

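$$(\mathcal{P}): \quad \min \left\{ P(x) = \sum_{i=1}^{n-1} \left[ (x_i - 1)^2 + \frac{1}{2}\alpha \,(x_{i+1} - x_i^2)^2 \right] \;\Big|\; x \in \mathcal{X} \right\}, \tag{1}$$

where $x = \{x_i\} \in \mathcal{X} = \mathbb{R}^n$ is a real unknown vector, $\alpha = 2N$ and $N$ is a given real
number. Clearly, this is a nonconvex minimization problem which could have multiple
local minimizers.

In order to use the canonical duality theory for solving this nonconvex problem, we
need to define a geometrically admissible canonical measure

$$\xi = \Lambda(x) = \{\xi_i\} = \{x_i^2 - x_{i+1}\} \in \mathcal{E}_a \subset \mathbb{R}^{n-1}. \tag{2}$$
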
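As a concrete reading of problem (1) and the canonical measure (2), the following minimal Python sketch evaluates both. The function names `rosenbrock` and `canonical_measure` and the default `N = 100` (the value used later in the numerical section) are illustrative assumptions, not part of the original text.

```python
def rosenbrock(x, N=100.0):
    """Primal objective of (1):
    P(x) = sum_{i=1}^{n-1} [(x_i - 1)^2 + (alpha/2) (x_{i+1} - x_i^2)^2], alpha = 2N."""
    alpha = 2.0 * N
    return sum(
        (x[i] - 1.0) ** 2 + 0.5 * alpha * (x[i + 1] - x[i] ** 2) ** 2
        for i in range(len(x) - 1)
    )


def canonical_measure(x):
    """Canonical measure of (2): xi_i = x_i^2 - x_{i+1}, i = 1, ..., n-1."""
    return [x[i] ** 2 - x[i + 1] for i in range(len(x) - 1)]


if __name__ == "__main__":
    # Global minimizer (1, ..., 1) and the point (-1, 1, ..., 1) discussed in the text.
    print(rosenbrock([1.0] * 5))
    print(rosenbrock([-1.0] + [1.0] * 4))
    print(canonical_measure([1.0, 1.0, 1.0]))
```

At $x = (1, \dots, 1)$ every term vanishes, and at $x = (-1, 1, \dots, 1)$ only the term $(x_1 - 1)^2 = 4$ survives, which is why the local minimum near that point has a value close to 4.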
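The canonical function $V : \mathcal{E}_a \to \mathbb{R}$ can be defined by

$$V(\xi) = \sum_{i=1}^{n-1} \frac{1}{2}\alpha\, \xi_i^2, \tag{3}$$

which is a convex function. The canonical dual variable $\varsigma = \xi^*$ can be defined uniquely
by

$$\varsigma = \{\varsigma_i\} = \nabla V(\xi) = \{\alpha \xi_i\}. \tag{4}$$

Therefore, by the Legendre transformation, the conjugate function $V^* : \mathcal{S} = \mathbb{R}^{n-1} \to \mathbb{R}$
is obtained as

$$V^*(\varsigma) = \sum_{i=1}^{n-1} \frac{1}{2\alpha}\, \varsigma_i^2. \tag{5}$$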
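The step from (3) and (4) to (5) can be written out explicitly. The following is a short worked sketch that uses only the definitions above and assumes $\alpha > 0$, so that $V$ is strictly convex and the stationary point is unique.

```latex
\begin{aligned}
V^*(\varsigma) &= \operatorname{sta}\Big\{ \xi^T \varsigma - V(\xi) \;\Big|\; \xi \in \mathcal{E}_a \Big\}, \\
\frac{\partial}{\partial \xi_i}\Big( \xi_i \varsigma_i - \tfrac{1}{2}\alpha\, \xi_i^2 \Big)
  &= \varsigma_i - \alpha\, \xi_i = 0
  \;\Longrightarrow\; \xi_i = \frac{\varsigma_i}{\alpha}, \\
V^*(\varsigma) &= \sum_{i=1}^{n-1} \Big( \frac{\varsigma_i^2}{\alpha} - \frac{\varsigma_i^2}{2\alpha} \Big)
  = \sum_{i=1}^{n-1} \frac{1}{2\alpha}\, \varsigma_i^2 .
\end{aligned}
```

For $N = 100$, i.e. $\alpha = 200$, the coefficient $1/(2\alpha)$ equals $1/400$.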



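Replacing $\sum_{i=1}^{n-1} \frac{1}{2}\alpha\,(x_{i+1} - x_i^2)^2$ by the Legendre equality $V(\Lambda(x)) = \Lambda(x)^T \varsigma - V^*(\varsigma)$,
the total complementary function $\Xi : \mathcal{X} \times \mathcal{S} \to \mathbb{R}$ is given by

$$\Xi(x, \varsigma) = \sum_{i=1}^{n-1} \left[ (x_i - 1)^2 + \varsigma_i\,(x_i^2 - x_{i+1}) - \frac{1}{2\alpha}\, \varsigma_i^2 \right]. \tag{6}$$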



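Let $\delta^f$ and $\delta^b$ be shifting operators such that $\delta^f \varsigma_i = \varsigma_{i+1}$ and $\delta^b \varsigma_i = \varsigma_{i-1}$. We define
$\delta^b \varsigma_1 = 0$. Then on the canonical dual feasible space

$$\mathcal{S}_a = \{ \varsigma \in \mathcal{S} \mid \varsigma_i + 1 \neq 0 \;\; \forall i = 1, \dots, n-2, \;\; \varsigma_{n-1} = 0 \}, \tag{7}$$

the canonical dual can be obtained by

$$P^d(\varsigma) = \operatorname{sta}\{ \Xi(x, \varsigma) \mid x \in \mathcal{X} \} = n - 1 - \sum_{i=1}^{n-1} \left[ \frac{(\delta^b \varsigma_i + 2)^2}{4(\varsigma_i + 1)} + \frac{1}{2\alpha}\, \varsigma_i^2 \right]. \tag{8}$$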
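The passage from the total complementary function (6) to the dual function (8) can be spelled out as follows. This is a sketch of the standard stationarity argument, written with the convention $\varsigma_0 := \delta^b \varsigma_1 = 0$; it is a reconstruction from the definitions above rather than a quotation of the authors' derivation.

```latex
\begin{aligned}
\frac{\partial \Xi}{\partial x_i} &= 2(x_i - 1) + 2\varsigma_i x_i - \varsigma_{i-1} = 0
  \;\Longrightarrow\;
  x_i = \frac{\varsigma_{i-1} + 2}{2(\varsigma_i + 1)}, \qquad i = 1, \dots, n-1, \\
\frac{\partial \Xi}{\partial x_n} &= -\varsigma_{n-1} = 0, \\
\Xi(x, \varsigma) &= \sum_{i=1}^{n-1} \Big[ (\varsigma_i + 1)\, x_i^2 - (\varsigma_{i-1} + 2)\, x_i + 1
  - \frac{1}{2\alpha}\varsigma_i^2 \Big] - \varsigma_{n-1} x_n, \\
(\varsigma_i + 1)\, x_i^2 - (\varsigma_{i-1} + 2)\, x_i
  &= -\frac{(\varsigma_{i-1} + 2)^2}{4(\varsigma_i + 1)}
  \quad \text{at } x_i = \frac{\varsigma_{i-1} + 2}{2(\varsigma_i + 1)} .
\end{aligned}
```

Summing the constant 1 over $i = 1, \dots, n-1$ gives the leading term $n - 1$ in (8), and the condition $\partial \Xi / \partial x_n = 0$ is the source of the constraint $\varsigma_{n-1} = 0$ in (7).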



Based on the complementary-dual principle proposed in the canonical duality theory
(see [?]), we have the following result.



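Theorem 1 If $\bar{\varsigma}$ is a critical point of $P^d(\varsigma)$, then the vector $\bar{x} = \{\bar{x}_i\}$ defined by

$$\bar{x}_i = \frac{\delta^b \bar{\varsigma}_i + 2}{2(\bar{\varsigma}_i + 1)}, \quad i = 1, \dots, n-1, \qquad \bar{x}_n = \bar{x}_{n-1}^2, \tag{9}$$

is a critical point of $P(x)$ and

$$P(\bar{x}) = \Xi(\bar{x}, \bar{\varsigma}) = P^d(\bar{\varsigma}). \tag{10}$$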
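The analytic form (9) is easy to exercise numerically. The sketch below (plain Python; the function names are illustrative assumptions) maps a dual vector to the primal vector given by (9) and checks that the total complementary function (6) evaluated there equals the dual function (8). It assumes $\alpha = 200$ (i.e. $N = 100$), $\varsigma_{n-1} = 0$ and $\varsigma_i \neq -1$.

```python
def primal_from_dual(s):
    """Eq. (9): x_i = (s_{i-1} + 2) / (2 (s_i + 1)) with s_0 := 0, and x_n = x_{n-1}^2."""
    x, prev = [], 0.0
    for si in s:
        x.append((prev + 2.0) / (2.0 * (si + 1.0)))
        prev = si
    x.append(x[-1] ** 2)
    return x


def total_complementary(x, s, alpha=200.0):
    """Eq. (6): sum of (x_i - 1)^2 + s_i (x_i^2 - x_{i+1}) - s_i^2 / (2 alpha)."""
    return sum(
        (x[i] - 1.0) ** 2 + s[i] * (x[i] ** 2 - x[i + 1]) - s[i] ** 2 / (2.0 * alpha)
        for i in range(len(s))
    )


def dual(s, alpha=200.0):
    """Eq. (8): n - 1 - sum of (s_{i-1} + 2)^2 / (4 (s_i + 1)) + s_i^2 / (2 alpha)."""
    total, prev = 0.0, 0.0
    for si in s:
        total += (prev + 2.0) ** 2 / (4.0 * (si + 1.0)) + si ** 2 / (2.0 * alpha)
        prev = si
    return len(s) - total


if __name__ == "__main__":
    s = [0.3, -0.5, 0.8, 0.0]           # arbitrary test vector with s_{n-1} = 0
    x = primal_from_dual(s)
    print(total_complementary(x, s), dual(s))
    print(primal_from_dual([0.0, 0.0, 0.0, 0.0]))
```

For the dual vector $\varsigma = 0$ the formula returns $x = (1, \dots, 1)$, the known global minimizer, and both functions evaluate to 0 there.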
This theorem actually presents an "analytic" solution form for the Rosenbrock function,
i.e. each critical point of the Rosenbrock function must be in the form of (9) for a
dual solution $\bar{\varsigma}$. The first version of this analytical solution form was presented in
nonconvex variational problems in phase transitions and finite deformation mechanics
[3, 6, 7]. The extremality of the analytical solution is governed by the so-called triality
theory. Let

$$\mathcal{S}_a^+ = \{ \varsigma \in \mathcal{S}_a \mid \varsigma_i + 1 > 0 \;\; \forall i = 1, \dots, n-1 \}, \tag{11}$$

we have the following theorem:

Theorem 2 Suppose that $\bar{\varsigma}$ is a critical point of $P^d(\varsigma)$ and the vector $\bar{x} = \{\bar{x}_i\}$ is
defined by Theorem 1.

If $\bar{\varsigma} \in \mathcal{S}_a^+$, then $\bar{\varsigma}$ is a global maximal solution to the canonical dual problem on $\mathcal{S}_a^+$,
i.e.,

$$(\mathcal{P}^d_+): \quad \max \{ P^d(\varsigma) \mid \varsigma \in \mathcal{S}_a^+ \}, \tag{12}$$

the vector $\bar{x}$ is a global minimal solution to the primal problem, and

$$P(\bar{x}) = \min_{x \in \mathcal{X}} P(x) = \max_{\varsigma \in \mathcal{S}_a^+} P^d(\varsigma) = P^d(\bar{\varsigma}). \tag{13}$$

Theorem 2 shows that the canonical dual problem $(\mathcal{P}^d_+)$ provides a global optimal
solution to the nonconvex primal problem. Since $(\mathcal{P}^d_+)$ is a concave maximization problem
over a convex space, it can be solved easily. This theorem is actually a special
application of Gao and Strang's general result on global minimizers in nonconvex
analysis [15].
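To illustrate why a concave maximization over the convex set where all $\varsigma_i + 1 > 0$ is easy for a local method, the following sketch maximizes the dual function (8) by plain gradient ascent with backtracking. This is a swapped-in technique for illustration only: it is not the discrete gradient (DG) method used in the numerical section, and the function names, the step rule and the tolerances are assumptions. The last component $\varsigma_{n-1}$ is held at 0, $\alpha = 200$ ($N = 100$), and the starting point is $(-2/3, \dots, -2/3, 0)$.

```python
def dual(s, alpha=200.0):
    """Dual function (8) with the convention s_0 := 0."""
    total, prev = 0.0, 0.0
    for si in s:
        total += (prev + 2.0) ** 2 / (4.0 * (si + 1.0)) + si ** 2 / (2.0 * alpha)
        prev = si
    return len(s) - total


def dual_grad(s, alpha=200.0):
    """Gradient of (8) in s_1..s_{n-2}; the last component stays fixed at 0."""
    m = len(s)
    g = [0.0] * m
    for j in range(m - 1):
        prev = s[j - 1] if j > 0 else 0.0
        g[j] = (
            (prev + 2.0) ** 2 / (4.0 * (s[j] + 1.0) ** 2)
            - s[j] / alpha
            - (s[j] + 2.0) / (2.0 * (s[j + 1] + 1.0))
        )
    return g


def maximize_dual(s0, alpha=200.0, tol=1e-6, max_iter=5000):
    """Gradient ascent with Armijo backtracking, staying inside s_i + 1 > 0."""
    s = list(s0)
    for _ in range(max_iter):
        g = dual_grad(s, alpha)
        gnorm2 = sum(v * v for v in g)
        if gnorm2 ** 0.5 < tol:
            break
        f, t, accepted = dual(s, alpha), 1.0, False
        for _ in range(60):
            trial = [si + t * gi for si, gi in zip(s, g)]
            inside = all(v > -1.0 for v in trial)
            if inside and dual(trial, alpha) >= f + 1e-4 * t * gnorm2:
                s, accepted = trial, True
                break
            t *= 0.5
        if not accepted:
            break
    return s


if __name__ == "__main__":
    s_star = maximize_dual([-2.0 / 3.0] * 3 + [0.0])   # n = 5, dual dimension 4
    print(s_star, dual(s_star))
```

Under these assumptions the iterates should approach $\varsigma = 0$ with $P^d \approx 0$, which by (9) corresponds to $x = (1, \dots, 1)$.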

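By introducing

$$\mathcal{S}_a^- = \mathcal{S}/\mathcal{S}_a^+ = \{ \varsigma \in \mathbb{R}^{n-1} \mid \varsigma_i + 1 < 0 \;\; \forall i = 1, \dots, n-1 \}, \tag{14}$$

recently the triality theory (see [11]) leads to the following theorem.

Theorem 3 Suppose that $\bar{\varsigma}$ is a critical point of $P^d(\varsigma)$ and the vector $\bar{x} = \{\bar{x}_i\}$ is
defined by Theorem 1.

If $\bar{\varsigma} \in \mathcal{S}_a^-$, then on a neighborhood $\mathcal{X}_o \times \mathcal{S}_o \subset \mathcal{X} \times \mathcal{S}_a^-$ of $(\bar{x}, \bar{\varsigma})$, we have either

$$P(\bar{x}) = \min_{x \in \mathcal{X}_o} P(x) = \min_{\varsigma \in \mathcal{S}_o} P^d(\varsigma) \tag{15}$$

or

$$P(\bar{x}) = \max_{x \in \mathcal{X}_o} P(x) = \max_{\varsigma \in \mathcal{S}_o} P^d(\varsigma). \tag{16}$$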
The proof of this theorem can be derived from the recent paper by Gao and Wu [11].
By the fact that the canonical dual function is a d.c. function (difference of convex
functions) on $\mathcal{S}_a^-$, the double-min duality (15) can be used for finding the biggest local
minimizer of the Rosenbrock function $P(x)$, while the double-max duality (16) can be
used for finding the biggest local maximizer of $P(x)$. In physics and material sciences,
this pair of biggest local extremal points plays an important role in phase transitions.

Because $\varsigma_{n-1} = 0$, we have $\varsigma_{n-1} + 1 = 1 > 0$, so the condition in (14) cannot hold for
$i = n-1$ and $\mathcal{S}_a^-$ is an empty set. Thus, by Theorem 3 in this paper we cannot find a
local maximizer or minimizer on $\mathcal{S}_a^-$ or its subset for $P^d(\varsigma)$.

3 Numerical Examples and Discussion 

$(\mathcal{P})$ and $(\mathcal{P}^d_+)$ will be solved by the discrete gradient (DG) method ([?]), which is a local
search optimization solver for nonconvex and/or nonsmooth optimization problems. In
two-dimensional space, the Rosenbrock function has a long ravine with very steep walls and
a flat bottom; "because of the curved flat valley the optimization is zig-zagging slowly
with small stepsizes towards the minimum" (en.wikipedia.org/wiki/Gradient_descent).
This means that any gradient method may fail to minimize the Rosenbrock function even
in 2 dimensions. The DG method is a derivative-free method which can be applied
for minimizing/maximizing the Rosenbrock function and its dual. Numerical experiments
have been carried out on a personal notebook computer with an Intel(R) Celeron(R) CPU 900
@ 2.20GHz running Windows Vista™ Home Basic.
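The slow progress of gradient steps along the curved valley can be observed with a few lines of Python. The sketch below runs fixed-step gradient descent on the two-dimensional Rosenbrock function with $N = 100$. The starting point $(-1.2, 1)$, the step size $10^{-3}$ and the iteration count are illustrative assumptions and are not taken from the experiments reported here, which use the DG method.

```python
def rosenbrock2d(x1, x2, N=100.0):
    """Two-dimensional instance of (1): (x1 - 1)^2 + N (x1^2 - x2)^2."""
    return (x1 - 1.0) ** 2 + N * (x1 ** 2 - x2) ** 2


def grad2d(x1, x2, N=100.0):
    """Analytic gradient of rosenbrock2d."""
    r = x1 ** 2 - x2
    return 2.0 * (x1 - 1.0) + 4.0 * N * x1 * r, -2.0 * N * r


def gradient_descent(x1, x2, step=1e-3, iters=1000):
    """Plain fixed-step gradient descent; returns the final point and the value history."""
    history = [rosenbrock2d(x1, x2)]
    for _ in range(iters):
        g1, g2 = grad2d(x1, x2)
        x1, x2 = x1 - step * g1, x2 - step * g2
        history.append(rosenbrock2d(x1, x2))
    return (x1, x2), history


if __name__ == "__main__":
    point, history = gradient_descent(-1.2, 1.0)
    print(point, history[0], history[-1])
```

The objective drops quickly while the iterate falls into the valley, after which each step moves only a short distance along the valley floor, so the final point is still well away from $(1, 1)$ after 1000 steps.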

We try $N = 100$ (for $N = 10$ we find that the numerical results are similar to those for
$N = 100$), with the dimensions $n$ = 2~10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400,
500, 600, 700, 800, 900, 1000, 2000, 3000, 4000. We first set (3, 3, ..., 3) (called seed1) as the
initial solution for $(\mathcal{P})$ (usually the feasible solution space is a box constrained between
-2.048 and 2.048 [11, ?, 17]). Numerical results (Table 1) show that, when solving the primal
problem $(\mathcal{P})$, the DG method can easily and quickly get an approximate global minimum
solution close to $x = (1, 1, \dots, 1)$ with approximate global optimal value $P(x) = 0$,
except for $n$ = 5~7 and 4000, where the DG method can only get a local minimum solution
near $x = (-1, 1, \dots, 1)$ with $P(x) \approx 4$. Then we let $x = (100, 100, \dots, 100)$ (called seed2)
be the initial solution for $(\mathcal{P})$, searched in the intervals $-500 < x_i < 500$, $i = 1, 2, \dots, n$.
We find that the DG method was trapped at local optimal solutions without reaching
any global minimum at all, even for a 2-dimensional problem (see Table 2), where its
objective function value is 47.23824896. However, from Table 2 we can see that by the same
DG method, the global maximum of the dual problem can be obtained very elegantly.

For $(\mathcal{P}^d_+)$, the corresponding dimensions are 1~9, 19, 29, 39, 49, 59, 69, 79, 89, 99,
199, 299, 399, 499, 599, 699, 799, 899, 999, 1999, 2999, 3999. The initial solution is
set as $\varsigma = (-2/3, -2/3, \dots, -2/3, 0)$ (called seed1), and the constraints $\varsigma_i + 1 > 0$, $i =
1, 2, \dots, n-1$, were penalized into the objective function; by $\partial \Xi(x, \varsigma)/\partial x_n = 0$ of formula
(6), we can set the value of the last variable $\varsigma_{n-1}$ to be always $0$ ($> -1$). With these
numerical computation settings, the DG method can easily and quickly solve all these
test problems and accurately get a global maximizer $\bar{\varsigma} = (0, 0, \dots, 0)$ with the optimal
value $P^d(\bar{\varsigma}) = 0$ (Table 1). By the fact that the canonical dual problem $(\mathcal{P}^d_+)$ is a
concave maximization over a convex open space, the DG method was not trapped at
any local optimal solution. But, for the nonconvex primal problem $(\mathcal{P})$ in dimensions
$n$ = 5~7 and 4000, the DG method was trapped at the local minimizer near $x = (-1, 1, \dots, 1)$.
If we set the initial solution as $\varsigma = (100, 100, \dots, 100, 0)$ (called seed2) and repeat the
calculations, our numerical results (Table 2) show again that the canonical dual
problem can be easily and quickly solved by the DG method to accurately get the global
maximizer $\bar{\varsigma} = (0, 0, \dots, 0)$ with the optimal value $P^d(\bar{\varsigma}) = 0$ for the dual dimensions
$n - 1$ = 1~9, 19, 29, 39, 49, 59, 69, 79, 89, 99, 199, 299, 399, 499, 599, 699, 799, 899, 999, 1999.

The comparisons between $(\mathcal{P})$ and $(\mathcal{P}^d_+)$ in terms of the total number of iterations and
the total number of objective function evaluations (i.e. function calls) are listed in Tables
1 and 2. Compared with $(\mathcal{P}^d_+)$, the approximate global and local optimal solutions of $(\mathcal{P})$ and
their optimal objective function values are not accurate, and cannot even be obtained
if the initial iterate is set to be $x = (100, 100, \dots, 100)$. In Table 1 we can see that
the total numbers of iterations and function calls for $(\mathcal{P})$ are always greater than those
for $(\mathcal{P}^d_+)$. This means that $(\mathcal{P}^d_+)$ costs less computation than $(\mathcal{P})$, while $(\mathcal{P}^d_+)$
still yields accurate global optimal solutions and the global optimal objective function
value. The initial solutions $x = (100, 100, \dots, 100)$ and $\varsigma = (100, 100, \dots, 100, 0)$,
respectively for $(\mathcal{P})$ and $(\mathcal{P}^d_+)$, are not practical for real numerical tests, so that the total
numbers of iterations and function calls of $(\mathcal{P})$ are sometimes less than those of $(\mathcal{P}^d_+)$.
Regarding the CPU times for solving $(\mathcal{P}^d_+)$ with $n = 4000$, the largest CPU time for
seed1 is 206.3581 seconds (i.e. 3.4393 minutes).

Table 1: Results of numerical experiments for $(\mathcal{P})$ and $(\mathcal{P}^d_+)$: N = 100, seed1

Dimension  Iterations            Function calls          Objective function value
n          (P)        (P^d_+)    (P)         (P^d_+)     (P)            (P^d_+)
2          120        24         2843        28          0.00001073     0.00000000
3          422        26         8996        137         0.00401438     0.00000000
4          3737       35         48352       202         0.00615273     0.00000000
5*         335        34         10179       399         3.96077434     0.00000000
6*         2375       44         43770       868         4.00635895     0.00000000
7*         1223       53         28009       1625        4.09419146     0.00000000
8          2160       55         46792       2100        0.01246714     0.00000000
9          2692       51         61017       2526        0.01397307     0.00000000
10         4444       63         91470       3979        0.01055630     0.00000000
20         3042       55         140924      10084       0.00940077     0.00000000
30         2321       58         133980      20515       0.01075478     0.00000000
40         1659       60         173795      26818       0.01227866     0.00000000
50         2032       57         219233      36459       0.01264147     0.00000000
60         1966       61         260701      50495       0.01048188     0.00000000
70         1876       56         272919      52545       0.01531147     0.00000000
80         1405       61         195156      59684       0.01594730     0.00000000
90         2142       61         371963      71320       0.01055831     0.00000000
100        2676       60         510722      70208       0.01125514     0.00000000
200        1395       61         653604      188589      0.01115318     0.00000000
300        1368       60         882760      235163      0.01574873     0.00000000
400        2085       66         1869675     301805      0.00928066     0.00000000
500        1155       59         1394240     358938      0.01168440     0.00000000
600        1226       63         1808285     451817      0.00918730     0.00000000
700        1557       60         2134359     559378      0.01257100     0.00000000
800        1398       61         2098062     522726      0.01442714     0.00000000
900        716        65         1904187     763449      0.01074534     0.00000000
1000       1825       61         3598608     681509      0.00897202     0.00000000
2000       257        62         2087277     1455472     0.00937219     0.00000000
3000       3221       60         20642543    2714296     0.01250373     0.00000000
4000*      679        60         7581502     3659292     4.11193171     0.00000000

Table 2: Results of numerical experiments for $(\mathcal{P})$ and $(\mathcal{P}^d_+)$: N = 100, seed2

Dimension  Iterations            Function calls          Objective function value
n          (P)        (P^d_+)    (P)         (P^d_+)     (P)            (P^d_+)
2          10013      24         227521      28          47.23824896    0.00000000
3          144        32         4869        235         96.49814330    0.00000000
4          144        81         5279        938         82.46230602    0.00000000
5          148        137        5682        1768        94.19254867    0.00000000
6          154        166*       6238        2590        88.84382963    0.00000000
7          159        179*       7097        3288        237.63078399   0.00000000
8          165        202*       7502        4300        238.41126013   0.00000000
9          153        206*       7137        5083        84.54205412    0.00000000
10         162        216*       7491        5920        83.23094398    0.00000000
20         225        285*       19111       17458       83.94779152    0.00000000
30         216        301*       20939       28543*      156.95838274   0.00000000
40         163        291*       19775       40444*      83.30960344    0.00000000
50         158        298*       33269       51888*      85.93091895    0.00000000
60         158        312*       34094       61767*      89.07412094    0.00000000
70         162        284*       35436       69865*      92.45725362    0.00000000
80         209        297*       35607       89127*      157.69955825   0.00000000
90         227        294*       60398       98748*      82.44035053    0.00000000
100        202        290*       57792       102796*     81.94595276    0.00000000
200        1826       262        436413      189293      83.77165551    0.00000000
300        195        259*       169238      261320*     152.95671738   0.00000000
400        195        278*       212104      375816*     82.49253919    0.00000000
500        190        297*       331637      522695*     82.40170647    0.00000000
600        292        303*       431092      559068*     150.15456693   0.00000000
700        189        275*       383735      758631*     89.14575473    0.00000000
800        198        270*       429674      701053*     84.50538257    0.00000000
900        198        280*       416150      867398*     85.32757049    0.00000000
1000       193        283*       445326      930761*     89.48369379    0.00000000
2000       232        310*       1123240     2030104*    84.26810981    0.00000000

Example 1. Let $n = 4$ (four dimensions). The global minimizer is known to be
$\bar{x} = (1, 1, 1, 1)$ and $P(\bar{x}) = 0$.

Solution: By using the DG method for both the primal problem $(\mathcal{P})$ and its canonical
dual, we have the numerical solutions

$x = (1.0166873133, 1.0337174892, 1.0687306765, 1.1425101552)$, $P(x) = 0.00615273$,

$\bar{\varsigma} = (0.0000000119, 0.0000000000, 0.0000000000)$, $P^d(\bar{\varsigma}) = 0.00000000$.

This shows that the canonical dual problem provides a more accurate solution.

Example 2. For dimension $n = 5$, the Rosenbrock function has exactly two
minima: one is the global optimal solution $(1, 1, 1, 1, 1)$ with global minimum
value 0, and the other is a local minimum near $(-1, 1, 1, 1, 1)$ with local minimum
value 4.

Solution: By the DG method, the primal solution is

$x = (-0.9856129203, 0.9814803343, 0.9682775584, 0.9398661046, 0.8840549028)$

with $P(x) = 3.96077434$. Clearly, this is a local minimizer, while the canonical dual
problem accurately produces a global optimal solution

$\bar{\varsigma} = (0.0000004388, 0.0000006036, 0.0000000000, 0.0000000000)$, $P^d(\bar{\varsigma}) = 0$.
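The figures reported in Example 2 can be cross-checked directly. The sketch below (plain Python with illustrative function names) re-evaluates the primal objective (1) with $N = 100$ at the reported DG iterate, and recovers a primal point from the reported dual solution through formula (9).

```python
def rosenbrock(x, N=100.0):
    """Primal objective (1) with alpha/2 = N."""
    return sum(
        (x[i] - 1.0) ** 2 + N * (x[i + 1] - x[i] ** 2) ** 2
        for i in range(len(x) - 1)
    )


def primal_from_dual(s):
    """Eq. (9): x_i = (s_{i-1} + 2) / (2 (s_i + 1)) with s_0 := 0, and x_n = x_{n-1}^2."""
    x, prev = [], 0.0
    for si in s:
        x.append((prev + 2.0) / (2.0 * (si + 1.0)))
        prev = si
    x.append(x[-1] ** 2)
    return x


# Values copied from Example 2.
x_dg = [-0.9856129203, 0.9814803343, 0.9682775584, 0.9398661046, 0.8840549028]
s_dual = [0.0000004388, 0.0000006036, 0.0000000000, 0.0000000000]

if __name__ == "__main__":
    print(rosenbrock(x_dg))                 # value at the trapped primal iterate
    x_rec = primal_from_dual(s_dual)
    print(x_rec, rosenbrock(x_rec))         # point recovered from the dual solution
```

The recovered point lies within about $10^{-6}$ of $(1, 1, 1, 1, 1)$, whereas the primal iterate sits in the basin of the local minimizer near $(-1, 1, 1, 1, 1)$.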

Example 3. For $n = 6$ (six dimensions), the Rosenbrock function has exactly two
minima, i.e., the global optimal solution

$x_1 = (1, 1, 1, 1, 1, 1)$, $P(x_1) = 0$,

and the local minimal solution

$x_2 = (-1, 1, 1, 1, 1, 1)$, $P(x_2) = 4$.

Solution: When solving the primal problem directly, the DG method can only provide the
local solution

$x = (-0.9970726441, 1.0041582933, 1.0133158817, 1.0292928527, 1.0607123926, 1.1258344785)$

with $P(x) = 4.00635895$. For the canonical dual problem, the DG method produces

$\bar{\varsigma} = (0.0000001747, -0.0000000559, 0.0000005919, 0.0000000000, 0.0000000000)$,

$P^d(\bar{\varsigma}) = 0$.
Example 4. Similarly, if $n = 7$, the test problem has the same global optimal
solution

$x_1 = (1, 1, 1, 1, 1, 1, 1)$, $P(x_1) = 0$,

and the local minimal solution

$x_2 = (-1, 1, 1, 1, 1, 1, 1)$, $P(x_2) = 4$.

Solution: By the DG method, we have

$x = (-1.0003403494, 1.0106728675, 1.0264433859, 1.0561180077, 1.1168007274, 1.2483026410, 1.5594822181)$,

$P(x) = 4.09419146$,

$\bar{\varsigma} = (-0.0000001431, -0.0000011147, -0.0000010643, -0.0000003284, 0.0000000000, 0.0000000000)$,

$P^d(\bar{\varsigma}) = 0$.

This shows again that the DG iteration for solving the primal problem is trapped at
a local minimum.

Example 5. Now we let $n = 4000$. The Rosenbrock function has many minima.
The global optimal solution is still $x_1 = (1, \dots, 1)$ with $P(x_1) = 0$. One of the local minima
is near the point $x_2 = (-1, 1, \dots, 1)$ with $P(x_2) = 4$.

Solution: Again, by the DG method, the primal iteration is trapped at

$x = (-0.9932861006, 0.9966510741, \dots, 1.3122885708, 1.7233744896)$, $P(x) = 4.11193171$.

The canonical dual solution is

$\bar{\varsigma} = (-0.0000000314, -0.0000000040, -0.0000000437, \dots, -0.0000000281, 0.0000000008, -0.0000000214, 0.0000000000, 0.0000000000)$,

which produces precisely the optimal value $P^d(\bar{\varsigma}) = 0$. Indeed, as long as $n > 5$, the
DG method is always trapped at the local minimizer near $x = (-1, 1, \dots, 1)$ if the initial
solution is set to be $x = (-1.0005, 1.0005, \dots, 1.0005)$.
It is worth noting that both $P(x)$ and $P^d(\varsigma)$ are sums of $n - 1$ items. This
is convenient for MPI (Message Passing Interface) parallel computations. We may
broadcast (MPI_Bcast) the number of items $n - 1$ to the processes, let each process calculate
its own items, and at last reduce (MPI_Reduce) all the partial sums onto one process to get
the sum. Thus on the Tambo machines of VLSCI (http://www.vlsci.unimelb.edu.au) we should be
able to successfully solve (1) and (12) with at least $3.2767 \times 10^7$ variables if setting
the maximal number of variables for the DG method to be 4000 (though the DG method and
its parallelization version ([3]) can solve optimization problems with more than 4000
variables). The successfully tested MPI code is as follows:

c     broadcast n - 1 (held in the integer variable nm1)
      call MPI_BCAST(nm1, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
c     check for quit signal
      if (nm1 .le. 0) goto 30
c     calculate the partial sum on this process
      sum = 0.0d0
c     loop for P(x)
      do 20 i = myid+1, nm1, numprocs
         sum = sum + (x(i) - 1.0d0)**2 + 100.0d0*(x(i)**2 - x(i+1))**2
 20   continue
c     alternative loop for P^d; s(i) holds the dual variable, s(0) = 0
      do 20 i = myid+1, nm1, numprocs
         if (i - 1 .eq. 0) s(0) = 0.0d0
         sum = sum + (s(i-1) + 2.0d0)**2/(4.0d0*(s(i) + 1.0d0))
     &             + (1.0d0/400.0d0)*s(i)**2
 20   continue
      f = sum
c     collect all the partial sums
      call MPI_REDUCE(f, objf, 1, MPI_DOUBLE_PRECISION, MPI_SUM, 0,
     &                MPI_COMM_WORLD, ierr)
c     node 0 (i.e. myid = 0) prints the sum objf
 30   if (myid .eq. 0) print *, 'sum = ', objf
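The decomposition used in the listing can be checked without an MPI installation. The following Python sketch imitates it in a single process: each simulated rank sums the terms $i = \text{myid}+1, \text{myid}+1+\text{numprocs}, \dots$ of $P(x)$, and the partial sums are then added as a reduction with the sum operation would do. The function names and the serial simulation are assumptions for illustration; no MPI library is called.

```python
def partial_sum(x, myid, numprocs, N=100.0):
    """Partial sum of P(x) over the terms assigned to rank `myid`.

    Mirrors `do 20 i = myid+1, n-1, numprocs` (1-based) with 0-based indexing.
    """
    total = 0.0
    for i in range(myid, len(x) - 1, numprocs):
        total += (x[i] - 1.0) ** 2 + N * (x[i] ** 2 - x[i + 1]) ** 2
    return total


def reduced_sum(x, numprocs, N=100.0):
    """Serial stand-in for the reduction: add the partial sums of all ranks."""
    return sum(partial_sum(x, rank, numprocs, N) for rank in range(numprocs))


if __name__ == "__main__":
    x = [0.5 + 0.01 * i for i in range(50)]
    serial = reduced_sum(x, 1)
    for p in (2, 3, 7):
        print(p, reduced_sum(x, p), serial)
```

Because the strided index sets of the ranks are disjoint and together cover $1, \dots, n-1$, the reduced value agrees with the serial sum up to floating-point rounding, whatever the number of processes.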
4 Conclusion

This research note demonstrates a powerful application of the canonical duality theory
for solving the nonconvex minimization problem of the Rosenbrock function. Extensive
numerical computations show that by using the same DG method, the canonical dual
problem can be easily solved to produce global solutions.